To better understand the genetic diversity and genomic features of 41 coronaviruses (CoVs) identified
in Kenya bats in 2006, seven CoVs, representing seven different phylogenetic groups identified
from partial polymerase gene sequences, were subjected to extensive genomic sequencing. As a result,
15-16 kb nucleotide sequences encoding the complete RNA-dependent RNA polymerase, spike, envelope,
membrane, and nucleocapsid proteins plus other open reading frames (ORFs) were generated. Sequence
analysis confirmed that the CoVs from Kenya bats are divergent members of the Alphacoronavirus and
Betacoronavirus genera. Furthermore, the CoVs BtKY22, BtKY41, and BtKY43 in the Alphacoronavirus genus
and BtKY24 in the Betacoronavirus genus are likely representatives of 4 novel CoV species. BtKY27 and
BtKY33 are members of the established bat CoV species in the Alphacoronavirus genus, and BtKY06 is a
member of the established bat CoV species in the Betacoronavirus genus. The genome organization of
these seven CoVs is similar to that of other known CoVs from the same groups, except for differences in
the number of putative ORFs following the N gene. The present results confirm a significant diversity of
CoVs circulating in Kenya bats. These Kenya bat CoVs are phylogenetically distant from any previously
described human and animal CoVs. However, because host switching among CoVs has occurred after
relatively minor sequence changes in the S1 domain of the spike protein, further surveillance in animal
reservoirs and a better understanding of host susceptibility are critical for predicting and preventing
the potential threat of bat CoVs to public health.
Published by Elsevier B.V. 


1. Introduction 


Coronaviruses (CoVs) are large, enveloped viruses contain- 
ing linear, positive-sense, single-stranded RNA genomes. Their 
genomes range from approximately 27 to 32 kb in length and contain
7-14 open reading frames (ORFs) (Woo et al., 2009a). Six major
ORFs encoding the polymerase complex (ORF1a and ORF1b), spike
glycoprotein (S), envelope protein (E), membrane glycoprotein (M),
and nucleocapsid protein (N) are present in all CoVs (Poon et al., 
2005). In addition, up to seven putative accessory ORFs and one 
ORF encoding hemagglutinin-esterase glycoprotein (HE) are inter- 
spersed between the six major ORFs. The numbers and sizes of these 
accessory ORFs differ markedly among CoVs (Woo et al., 2009a). 

CoVs have been identified in a broad range of birds and
mammals, including humans, in which they can cause respiratory,
enteric, hepatic and neurologic diseases of varying severity (Weiss
and Navas-Martin, 2005). CoVs in the subfamily Coronavirinae are
classified into three genera, Alphacoronavirus, Betacoronavirus, and
Gammacoronavirus (former serogroups 1-3) (de Groot et al., 2011).
Alpha- and betacoronaviruses have been isolated exclusively from
mammals, and the majority of gammacoronaviruses from birds. CoVs
of a distinctive lineage were recently detected from birds and pigs 
(Chu et al., 2011; Woo et al., 2009b, 2012) and have been proposed 
to belong to a new genus, provisionally named Deltacoronavirus (de 
Groot et al., 2011). The finding that the outbreak of severe acute res- 
piratory syndrome (SARS) in early 2003 was caused by a novel CoV 
(SARS-CoV) has boosted interest in the search for novel CoVs in 
humans and animals. At least 30 previously unrecognized distinctive
CoVs from human and various animal reservoirs have been reported
in recent years, including SARS-related CoVs and CoVs from
all genera in the subfamily Coronavirinae, which have significantly
expanded our understanding of CoV diversity and complexity (Woo
et al., 2009a). Based on available data, bats appear to harbor a great
diversity of CoVs. The frequency and diversity of CoV detection 
in bats, now in multiple continents, suggest that bats are likely a 
source for CoV introduction into other species globally and possibly 
play an important role in the ecology and evolution of CoVs. 

Recently we reported the identification of 41 divergent CoVs in
bats from Kenya, based on limited ORF1b sequences (Tong et al.,
2009). These newly discovered bat CoVs were grouped into 8 different
phylogenetic clusters. Of these, five clusters belonged to the
previously identified Alphacoronavirus genus, and three clusters
belonged to the previously identified Betacoronavirus genus,
including a SARS-related CoV lineage. In the present study, we expand
our sequence data for seven CoVs, representing 7 of the 8 distinctive
clusters we identified in Kenya bats during the summer of 2006
(Tong et al., 2009). The sample representing the eighth cluster, a
SARS-related CoV, was only weakly positive and limited in specimen
amount; it was therefore not included in further sequencing.
The purpose of our study was to further characterize
the genomes and refine the phylogenetic relationships of
these seven CoVs with other CoVs, based on ORF1b, S, E, M,
and N.


2. Materials and methods 
2.1. Bat sampling and RNA extraction 


Kenya was chosen as a major comparative Old World study 
location in Africa as part of the CDC Global Disease Detection 
program. Detailed information on bat capture and sampling is avail- 
able in the previous publication (Tong et al., 2009). The protocols 
for animal capture and use were approved by the CDC Animal 
Institutional Care and Use Committee and the Ethics and Animal 
Care and Use Committee of the Kenya Wildlife Service (Nairobi, 
Kenya). In brief, representative samples at each site were collected 
from bats of available species, including adults and juveniles of both
sexes. After euthanasia, a complete necropsy was performed in 
compliance with the approved field protocols. Samples included 
blood, various organs (liver, lung, and kidney), rectal and oral 
swabs. 

In this study, seven CoV-positive rectal swabs were selected
as representatives of the seven different phylogenetic groups
(Tong et al., 2009) for extensive genome sequencing. These
are Rousettus bat coronavirus/Kenya/KY06/2006 (BtKY06),
Chaerephon bat coronavirus/Kenya/KY22/2006 (BtKY22),
Eidolon bat coronavirus/Kenya/KY24/2006 (BtKY24),
Miniopterus bat coronavirus/Kenya/KY27/2006 (BtKY27), Miniopterus
bat coronavirus/Kenya/KY33/2006 (BtKY33), Chaerephon bat
coronavirus/Kenya/KY41/2006 (BtKY41), and Cardioderma
bat coronavirus/Kenya/KY43/2006 (BtKY43). BtKY43 was not
described previously, but represents a group of 4 Kenya bat CoVs
(BtKY03, BtKY12, BtKY13, and BtKY29) (Tong et al., 2009). Total
nucleic acids (TNA) were extracted using the QIAamp MinElute
Virus Spin Kit (Qiagen, Santa Clarita, CA) according to the
manufacturer’s instructions from 200 µl of a phosphate-buffered saline
suspension of the rectal swab and from homogenized organ tissues
(liver, lung, and/or kidney) of each bat, except for bats BtKY33
and BtKY43, whose organ tissues were not available. The TNA was
eluted in 80 µl of DEPC-treated water and stored at −80 °C.


2.2. Reverse transcription-PCR (RT-PCR) 


Each CoV-positive result on the rectal swab included in this 
study was repeated from different TNA aliquots. The presence of 
CoV RNA in organ tissues of these bats was determined using 
the pan CoV RT-PCR assays as described previously (Tong et al., 
2009) and the sequence specific and/or group specific CoV RT-PCR 
assays (Table S1). RT-PCR was performed as described previously
(Tong et al., 2009). Standard precautions were taken to
avoid cross-contamination of samples before and after RNA extraction
and amplification. Purified DNA amplicons were sequenced
with the RT-PCR primers on an ABI Prism 3130 automated capillary
sequencer using a BigDye Terminator v3.1 Cycle Sequencing
kit (Applied Biosystems, Carlsbad, CA).


2.3. Partial genome sequencing 


High-throughput 454 pyrosequencing of CoV RNA-positive
bat samples was initially attempted, but failed to yield any
CoV-associated reads because of its lower sensitivity. Therefore,
Sanger chain-termination sequencing of RT-PCR amplicons
was chosen for this study. Each of the seven contiguous sequences
was obtained using 4-6 pairs of semi-nested or nested consensus
degenerate group-specific primers and 4-7 pairs of semi-nested
or nested sequence-specific bridging primers, which generated a
series of 8-13 overlapping fragments covering 15-16 kb of genomic
sequence at the 3’ end (Table S1). The other half of the genome sequence,
containing ORF1a, was not recovered in this analysis because of
the limited amount of rectal swab sample. Consensus degenerate
primers of each group were designed from conserved sequences 
of known members of the corresponding sequence group or its 
close group based on CODEHOP strategy (Rose et al., 1998). The 
3’ end of genome sequence was determined using the 3’ RACE kit 
(Roche, Indianapolis, IN) according to the manufacturer’s instructions.
Semi-nested or nested primers were used to improve
PCR sensitivity. When nested primers were not available, the PCR
product was re-amplified using the same RT-PCR primers. The
RT-PCR reactions were performed with the SuperScript III one-step
RT-PCR High Fidelity kit (Invitrogen, San Diego, CA) according to
the manufacturer’s instructions, and the second-round PCR reactions
were performed with the AccuPrime Taq DNA polymerase High
Fidelity kit (Invitrogen, San Diego, CA). The RT-PCR products were
visualized on 1% agarose gels containing 0.5 µg/mL of ethidium
bromide and purified with the QIAquick PCR purification kit (QIAGEN,
Santa Clarita, CA). The RT-PCR amplicons for each sample were
first sequenced with the consensus degenerate RT-PCR primers in
both directions, and the remaining internal gaps and the 3’ end of the
genome were then sequenced with sequence-specific bridging primers
in both directions, as described previously. The genomic sequences
(ORF1b, S, ORF3, E, M, and N) of BtKY22, BtKY33, BtKY27, BtKY41,
BtKY43, BtKY06, and BtKY24 were deposited in NCBI GenBank
(HQ728480-HQ728486).


2.4. Sequence analysis 


Sequences were assembled in Sequencher (Genecodes, Ann 
Arbor, MI). Each putative ORF was predicted using the NCBI 
ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). N- 
glycosylation sites were predicted using NetNGlyc 1.0 Server 
(http://www.cbs.dtu.dk/services/NetNGlyc/). BLAST analyses 
were performed against NCBI non-redundant protein database 
(Altschul et al., 1990) and against the Conserved Domain Database 
for protein classification (CDD) (Marchler-Bauer et al., 2005) to 
characterize the putative ORFs. 
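
As a rough illustration of what an ORF prediction step does, the sketch below scans the three forward reading frames of a nucleotide sequence for ATG-to-stop stretches. It is not the NCBI ORF finder algorithm; the function name `find_orfs`, the minimum-length cutoff, and the toy sequence are assumptions made for this example only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_len=30):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    `min_len` (in nt) is an arbitrary illustrative cutoff.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)

# Hypothetical toy sequence, not taken from any Kenya bat CoV genome.
toy = "CCATGAAACCCGGGTTTTAACC"
print(find_orfs(toy, min_len=9))
```

A real prediction would also scan the reverse strand where relevant and apply a biologically motivated length cutoff; for a positive-sense RNA genome the forward strand is the one of interest.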

Alignments of the seven Kenya bat CoV gene sequences with
a representative set of 43 other CoV sequences available in the
public domain were performed using MUSCLE v3.6 (Edgar,
2004). We constructed maximum likelihood trees for each gene
alignment (ORF1b, S, E, M, and N) in the MEGA software package
v5.0 (Tamura et al., 2011) with 1000 bootstrap replications. We
used the general time-reversible nucleotide (nt) substitution model
with 4 categories of gamma-distributed rate heterogeneity and
a proportion of invariant sites (GTR+Γ4+I). To identify potential
recombination events among the seven Kenya bat CoVs, three methods
implemented in the recombination detection program RDP version 2
(Martin et al., 2005) were used: MaxChi (Smith, 1992),
Chimaera (Posada et al., 2002), and Geneconv (Padidam et al., 1999).

Fig. 1. Schematic representation of the genome organization of Kenya bat CoVs and representative alpha- and beta-coronaviruses. Shaded boxes represent open reading
frames (ORFs) encoding structural proteins and unshaded boxes represent those encoding nonstructural proteins.


Table 1
Genomic features of open reading frames from seven bat coronaviruses and their putative transcription regulatory sequences (TRS).

Genus                           Alphacoronavirus                          Betacoronavirus
Virus                           BtKY27  BtKY33  BtKY22  BtKY41  BtKY43    BtKY24  BtKY06
Sequence^a (nt)                 15314   15908   15480   15578   15474     16186   16201
ORF1a (nt)                      NA^b    NA^b    NA^b    NA^b    NA^b      NA^b    NA^b
ORF1b (nt)                      8022    8025    8025    8025    8022      8040    8067
S     ORF size (nt)             4128    4152    4071    4161    4095      3795    3837
      Putative TRS              CUAAAU  CUAAAU  CUAAAU  CGAAAU  CUAAAU    ACGAAC  ACGAAC
ORF3  ORF size (nt)             660     672     672     687     660       717     663
      Putative TRS              CGUUAC  CGUUAC  CGUUAC  CUAGAC  CUAAAC    ACGAAC  ACGAAC
E     ORF size (nt)             225     225     225     231     243       228     249
      Putative TRS              CUAUAC  CUUUAC  CUCUAC  CUAGAC  CUUUAC    UCGAAC  UCGAAC
M     ORF size (nt)             768     780     684     690     684       666     669
      Putative TRS              CUAAAC  CUAAAC  CUAAAC  CUAAAC  CUAAAC    ACGAAC  ACGAAC
N     ORF size (nt)             1185    1296    1263    1227    1182      1404    1407
      Putative TRS              CUAAAC  CUAAAC  CUAAAC  CUAAAU  CUAAAC    ACGAAC  ACGAAC
ORFx  ORF size (nt)             -       486     291     264     288       567     558
      Putative TRS              -       CAAAAU  CUAAAC  CUAAAU  CUAAAC    ACGAAC  ACGAAC
ORFy  ORF size (nt)             -       -       -       195     -         432     450
      Putative TRS              -       -       -       CUAAAC  -         ACGAAC  ACGAAC
3’ UTR (nt, excluding poly A)   269     222     291     222     221       231     217

a Partial genome sequence starts from the first nt position in the RdRp to the end of genome.
b NA, not available.

Fig. 2. Multiple amino acid sequence alignments showing the putative S1-S2 junctional region of CoV spike protein. The identical amino acids are highlighted in black and
the similar amino acids are highlighted in gray. The regions containing the S1 GxCx motif, the conserved S2 nonamer IPTNFSISI, the furin cleavage site (in MHV, HCoV OC43, and
BCoV; underlined), and the cathepsin L cleavage site (in SARS-CoV) are indicated.


Events detected by all three methods with default parameters were 
considered as potential recombination events. 
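
The acceptance rule stated here (an event counts only if all three methods detect it) amounts to a set intersection. The sketch below illustrates that rule under simplifying assumptions: the event representation (sequence name plus breakpoint coordinates) and the example calls are hypothetical and do not reflect the actual RDP output format, and real breakpoint estimates from different methods seldom agree exactly.

```python
def consensus_events(calls_by_method):
    """Keep only events reported by every method (set intersection)."""
    event_sets = [set(events) for events in calls_by_method.values()]
    if not event_sets:
        return set()
    return set.intersection(*event_sets)

# Hypothetical calls: (sequence name, breakpoint start, breakpoint end).
calls = {
    "MaxChi":   [("seqA", 1200, 3400), ("seqB", 500, 900)],
    "Chimaera": [("seqA", 1200, 3400)],
    "Geneconv": [("seqA", 1200, 3400), ("seqC", 100, 300)],
}
print(consensus_events(calls))
```

In practice one would probably match events by overlapping breakpoint intervals rather than by exact coordinates; exact matching is used here only to keep the rule visible.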


3. Results and discussion 
3.1. Detection of CoV RNA in bat tissues 


The aliquots of bat rectal samples for BtKY27, BtKY33, BtKY22,
BtKY41, BtKY43, BtKY24, and BtKY06 were confirmed positive by
the pan-CoV RT-PCR assay. Among the tissues (liver, lung, and/or
kidney) that were available from bats BtKY27, BtKY22, BtKY41,
BtKY24, and BtKY06, only the liver from bat BtKY22 (Chaerephon
sp.) and the kidney from bat BtKY24 (Eidolon helvum) tested positive
by RT-PCR. These data support an infection process rather than
transit of ingested infected material through the digestive tract as 
the source of viral RNA in rectal swabs, particularly because these 
bat species do not feed on vertebrates. Negative results for other 
tissues may be explained by specific pathobiology and a limited 
tropism to the available tissues. 


3.2. Partial genome sequence and organization 

Each acquired CoV genome sequence covers the complete
ORF1b, S, ORF3, E, M, and N genes, other putative ORFs after N,
and the 3’ untranslated region with a poly-A tail. The genome
organization and the size of each ORF are shown in Fig. 1 and
Table 1, respectively. The organization is similar to that of other
known CoV genomes, in the order 5’-ORF1b, S, ORF3,
E, M, and N-3’, but with a variable number of putative ORFs
downstream of the N gene. The sizes of these seven genomic sequences
from ORF1b to the 3’ end are between ~15 kb and ~16 kb, and their G+C
contents are between 37.6% and 42.6%. BtKY27 has no evidence of
a putative ORF downstream of the N gene, but possesses a short
untranslated region and poly-A tail similar to Bat-CoV 1A (Chu et al.,
2008). BtKY22, BtKY33 and BtKY43 have one small putative ORF
(76-161 amino acids (aa)) downstream of N with no significant
homology to previously described CoV ORFs. BtKY06 and BtKY24
have two small putative ORFs downstream of N with sequence
similarity to NS7a and NS7b in Bat-CoV HKU9, respectively (Woo
et al., 2007). BtKY41 has two small putative ORFs downstream
of N, which overlap and have no significant sequence
homology to previously described ORFs.

Table 2
Pairwise sequence comparison of Kenya bat CoVs with their nearest known CoV species.

Like most alphacoronaviruses, the BtKY27, BtKY33, BtKY22, 
BtKY41, and BtKY43 viruses share a core sequence 5’-CUAAAC-3’ or 
similar putative transcription regulatory sequence (TRS) upstream 
of ORFs S, M, N, and ORFx and ORFy (Table 1) (Chu et al., 2008; Woo 
et al., 2005). ORF3 and E have putative core TRSs that sometimes
differ from those of the other ORFs. BtKY06 and BtKY24 have
the core TRS 5’-ACGAAC-3’ upstream of each ORF
except E, which has the core TRS 5’-UCGAAC-3’ (Table 1).
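
The search for a core TRS upstream of an ORF can be sketched as a windowed motif scan that tolerates a small number of mismatches, which is how variants such as CUAAAU would be picked up alongside CUAAAC. This is only an illustration: the function name `find_trs_upstream`, the 60 nt window, and the toy sequences are assumptions, not the procedure used to compile Table 1.

```python
def find_trs_upstream(genome, orf_start, core="CUAAAC", window=60, max_mismatch=0):
    """Return (position, motif, mismatches) hits for `core` located within
    `window` nt upstream of `orf_start` (0-based index of the start codon)."""
    genome = genome.upper().replace("T", "U")  # work in RNA alphabet
    core = core.upper().replace("T", "U")
    lo = max(0, orf_start - window)
    hits = []
    # A hit must end at or before the start codon.
    for i in range(lo, orf_start - len(core) + 1):
        candidate = genome[i:i + len(core)]
        mismatches = sum(a != b for a, b in zip(candidate, core))
        if mismatches <= max_mismatch:
            hits.append((i, candidate, mismatches))
    return hits

# Hypothetical toy sequence with an AUG at index 15.
toy = "GGGCUAAACGGAUUGAUGGCC"
print(find_trs_upstream(toy, orf_start=15))
```

Setting `core="ACGAAC"` would give the corresponding scan for the betacoronavirus core sequence quoted above.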

Spike proteins are type I glycosylated membrane proteins
with a putative signal peptide at the N terminus. There are 31, 27,


Genus             Kenya bat CoV  Nearest known CoV^a  % identity to nearest known CoV
                                                      3’ genome^b  Nsp12^c  Nsp13^c  Nsp14^c  Nsp15^c  Nsp16^c  S^c  E^c  M^c  N^c
Alphacoronavirus  BtKY27         Bat-CoV 1A           85           97       96       94       95       95       87   91   93   91
                  BtKY33         Bat-CoV 1A           75           93       91       90       87       93       62   65   75   69
                  BtKY22         [illegible]          71           86       88       80       75       87       56   70   79   58
                  BtKY41         Bat-CoV/512/05       69           80       86       79       75       86       55   63   70   57
                  BtKY43         Bat-CoV HKU8         69           84       88       77       71       80       53   49   73   D2
Betacoronavirus   BtKY06         Bat-CoV HKU9         90           >99      99       99       99       86       83   97   95   94
                  BtKY24         Bat-CoV HKU          70           87       88       82       69       79       52   57   66   66

a The nearest known CoV species were chosen based on the BLAST search.
b 3’ 15-16 kb genome nucleotide identity.
c Amino acid identity.

Fig. 3. Phylogenetic analysis of ORF1b, S, M and N of bat CoVs from Kenya. The unrooted trees are constructed by the maximum likelihood method with 1000 bootstrap replications after ambiguous regions from alignments of ORF1b,
S, M, and N are removed. The seven Kenya CoVs are highlighted with solid circles. The genus taxonomy information is shown to the right side of the phylogeny. The maximum likelihood bootstrap is indicated next to the nodes.
The scale bar indicates the estimated number of nucleotide substitutions per site.

28, 25, 31, 20, and 19 potential N-glycosylation sites in BtKY22,
BtKY27, BtKY33, BtKY41, BtKY43, BtKY24, and BtKY06, respectively.
As shown in Fig. 2, the spike proteins of the seven bat CoVs lack a furin
protease recognition site, such as RRADR-S in murine hepatitis
virus (MHV), RRSRG-A in human CoV OC43 (HCoV OC43), and RRSRR-A
in bovine CoV (BCoV) (Follis et al., 2006), and a cathepsin L cleavage
site (VAYT-M) as in SARS-CoV (Bosch et al., 2008). Despite lacking
these conserved cleavage sites, they all consist of two domains, S1 and S2,
showing the conserved GxCx motif in S1 around the cleavage site
and the conserved nonamer motif IPTNFSISI or a similar motif in S2.
These motifs have been observed in other known CoVs (Follis et al., 
2006). The S1 is responsible for virus binding to the receptor on the 
target cells and may contain receptor binding domains (RBDs) that 
directly bind to host cellular receptors. For example, the RBDs of 
HCoV 229E, TGEV, and HCoV NL63 in Alphacoronavirus are mapped 
at the C terminus of their S1 domain (Bonavia et al., 2003; Godet 
et al., 1994; Lin et al., 2008). The RBDs of MHV and SARS-CoV in 
Betacoronavirus are mapped at N terminus and central region of S1 
domain, respectively (Li et al., 2005; Lin et al., 2008). Alignment of
the aa sequences of the S1 regions from BtKY22, BtKY27, BtKY33, BtKY41,
and BtKY43 of Alphacoronavirus with the corresponding known RBD
S1 regions of HCoV 229E, TGEV, and HCoV NL63 showed 33-41%
identity in the S1 RBD domains to HCoV 229E and 24-29% identity to
TGEV and HCoV NL63 (Fig. S1A-C). BtKY24 and BtKY06 from
Betacoronavirus are quite different in the corresponding RBD S1 regions
from SARS-CoV and MHV (17-19% identity) (Fig. S1D-E). The
dissimilarity of the S1 regions of these bat CoVs to those of other CoVs
may suggest a different host specificity.
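
Percent identity figures such as those quoted for the S1 regions are computed from aligned sequence pairs. The sketch below shows one common convention, counting matches over columns where neither sequence has a gap. The paper does not state which gap convention was used, so that choice, the function name `percent_identity`, and the toy alignment are assumptions.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical aligned peptide fragments, not real spike sequences.
print(round(percent_identity("MKT-LLV", "MRTALLI"), 1))
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give somewhat different percentages, which is worth keeping in mind when comparing values across studies.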


3.3. Phylogeny 


We constructed phylogenetic trees using maximum likelihood 
method based on nt sequences of ORF1b, S, E, M and N genes 
with representative viruses whose corresponding sequences of 
their genomes were available (Fig. 3). The phylogeny of E gene is 
not shown due to the short length and limited value for inferring 
species phylogenies. Similar topologies were observed in the phylo- 
genetic trees based on each of 5 ORFs (Fig. 3). The analysis revealed 
that among the seven bat CoVs, five belonged to Alphacoronavirus 
while the other two belonged to Betacoronavirus (Fig. 3). Phylo- 
genetic clusterings within Alphacoronavirus varied slightly when 
different genes were analyzed. For example, BtKY22 and BtKY43 
grouped into one monophyletic clade in ORF1b tree while they 
were grouped differently in the S and N gene trees with generally
insignificant bootstrap values (Fig. 3). Although recombination
was suspected, we found no evidence of recombination in the seven
analyzed viruses using MaxChi (Smith, 1992), Chimaera (Posada
et al., 2002), and Geneconv (Padidam et al., 1999). Since the analyses
were based on representatives from each CoV species, the results
suggest a lack of inter-species recombination in these viruses. One
explanation is that the recombination frequency decreases significantly
when the sequence divergence is high (Kleiboeker et al.,
2005; van Vugt et al., 2001). Alternatively, the lack of inter-species
recombination may be due to rare co-infections as the viruses adapted
to different bat species. Therefore, the phylogenetic incongruence
observed in the gene trees is probably due to low phylogenetic 
signals, which may be improved by sampling more CoVs that are 
related to BtKY22 and BtKY43. 

The pairwise nt comparisons among these seven bat CoV gene 
sequences revealed 67-76% overall nt identity. Among the five 
alphacoronaviruses, three (BtKY22, BtKY41 and BtKY43) were 
distantly related to other known alphacoronaviruses with only 
69-71% overall nt identity and with <90% aa identity in all five 
conserved domains (nsps 12-16) of ORF1b (Table 2). Since we 
were not able to obtain all the genome portions necessary for definite
species classification (de Groot et al., 2011), we adopted the
separation criteria based on the RdRp group units (RGU) (Drexler
et al., 2010). The aa distances in the 816 bp fragment of the
RdRp gene from the Kenya bat CoVs described in this study were
compared to the aa sequences from their close reference viruses
(Table S2).

BtKY22, BtKY41, and BtKY43 had >4.8% aa distance in the RdRp
fragment (Table S2). This suggests that they are most likely three
distinctive alphacoronavirus species. BtKY27 and BtKY33, identified
in Miniopterus bats, were closely related to Bat-CoV 1A, which was
identified from a bent-winged Miniopterus bat in Hong Kong (Chu
et al., 2006), with 85% and 75% overall nt identity and with >90% aa
identity in 5/5 and 4/5 conserved domains (nsps 12-16) in ORF1b,
respectively (Table 2). BtKY27 and BtKY33 had <4.8% aa distance
in the 816 bp RdRp fragment to their close reference viruses, indicating that
they are members of the established bat CoV species in
Alphacoronavirus.

As for the two members of the Betacoronavirus genus identified, one
(BtKY06, identified in a Rousettus aegyptiacus bat) was likely a member
of the Bat-CoV HKU9 species identified from a Rousettus leschenaulti
bat in China (Woo et al., 2007), sharing 90% overall nt identity
and 99% aa identity in 4/5 conserved domains (nsps 12-16) in
ORF1b (Table 2). The other (BtKY24) was distantly related to other
known betacoronaviruses, with <70% overall nt identity and <90%
aa identity in all 5 conserved domains (nsps 12-16) from ORF1b
(Table 2). Additionally, based on the RGU criteria, BtKY24 had >6.3%
aa distance in the 816 bp RdRp fragment compared to its closest
reference virus, indicating that it is most likely a distinctive
betacoronavirus.
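
The RGU-based reasoning in the preceding paragraphs reduces to computing an amino acid distance over the RdRp fragment and comparing it with a genus-specific threshold. The sketch below uses the two thresholds quoted in the text (4.8% for alphacoronaviruses, 6.3% for betacoronaviruses); the use of an uncorrected p-distance, the function names, and the toy sequences are assumptions, and the Table S2 values are not reproduced here.

```python
# Thresholds (% aa distance in the 816 bp RdRp fragment) as quoted in the text.
RGU_THRESHOLDS = {"Alphacoronavirus": 4.8, "Betacoronavirus": 6.3}

def aa_distance_percent(aln_a, aln_b, gap="-"):
    """Uncorrected p-distance (%) over ungapped columns of two aligned aa sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a.upper(), aln_b.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    differences = sum(a != b for a, b in pairs)
    return 100.0 * differences / len(pairs)

def exceeds_rgu_threshold(distance_percent, genus):
    """True if the distance is above the genus threshold (candidate separate RGU)."""
    return distance_percent > RGU_THRESHOLDS[genus]

# Hypothetical 100-residue toy sequences differing at 5 positions.
query = "A" * 95 + "G" * 5
reference = "A" * 100
d = aa_distance_percent(query, reference)
print(d, exceeds_rgu_threshold(d, "Alphacoronavirus"),
      exceeds_rgu_threshold(d, "Betacoronavirus"))
```

The same 5% distance falls above the alphacoronavirus threshold but below the betacoronavirus one, which is why the genus assignment has to be settled before the criterion is applied.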

In conclusion, sequence data for the structural and nonstructural
ORFs in the 3’ end of the genome of seven Kenya bat
CoVs confirmed their high diversity and their phylogenetic placement
into the Alphacoronavirus and Betacoronavirus genera. The four
clusters of Kenya bat CoVs represented by BtKY22, BtKY41, BtKY43,
and BtKY24, respectively, most likely belong to novel CoV species;
the two clusters represented by BtKY27 and BtKY33 are likely
members of Bat-CoV 1A; and the cluster represented by BtKY06
is likely a member of the Bat-CoV HKU9 species. As noted with other
novel CoVs, the genome organization is similar, but differences were
found in the number of putative ORFs downstream of the N ORF.
The present results are in line with previous findings of extensive
diversity of CoVs detected in bats and confirm that bat CoVs
mainly belong to the Alphacoronavirus and Betacoronavirus genera 
(Lau et al., 2005, 2007; Tang et al., 2006; Woo et al., 2007, 2009b). 
Consistent with other reports, none of the bat CoVs characterized in
the present study was sufficiently similar to the human SARS-CoV
or other human CoVs to be considered their direct progenitor.
The examples of host switching among CoVs after relatively minor
sequence changes in the S1 domain of the spike protein (Haijema et al.,
2003; Kuo et al., 2000; Qu et al., 2005) suggest a potential risk
of introduction into humans, as occurred with SARS-CoV. Therefore,
characterization of novel CoVs and an understanding of species
diversity in animals should help in understanding and responding to
emerging zoonotic infections.